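Accurate load forecasting is critical for electricity market operations and for other real-time decision-making tasks in power systems. This paper considers the short-term load forecasting (STLF) problem for residential customers within a community. Existing STLF work mainly focuses on forecasting the aggregated load of a feeder system or of a single customer, but little effort has been devoted to forecasting load at the level of individual appliances. In this work, we present an STLF algorithm for efficiently predicting the power consumption of individual appliances. The proposed method builds on a powerful recurrent neural network (RNN) architecture in deep learning, known as long short-term memory (LSTM). Because each appliance has its own repetitive consumption pattern, the pattern of prediction errors is tracked so that past prediction errors can be used to improve the final prediction performance. Numerical tests on real-world load datasets demonstrate the improvement of the proposed method over existing LSTM-based methods and other benchmark approaches.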
translated by 谷歌翻译
Generating realistic 3D worlds occupied by moving humans has many applications in games, architecture, and synthetic data creation. But generating such scenes is expensive and labor intensive. Recent work generates human poses and motions given a 3D scene. Here, we take the opposite approach and generate 3D indoor scenes given 3D human motion. Such motions can come from archival motion capture or from IMU sensors worn on the body, effectively turning human movement into a "scanner" of the 3D world. Intuitively, human movement indicates the free space in a room and human contact indicates surfaces or objects that support activities such as sitting, lying or touching. We propose MIME (Mining Interaction and Movement to infer 3D Environments), which is a generative model of indoor scenes that produces furniture layouts that are consistent with the human movement. MIME uses an auto-regressive transformer architecture that takes the already generated objects in the scene as well as the human motion as input, and outputs the next plausible object. To train MIME, we build a dataset by populating the 3D FRONT scene dataset with 3D humans. Our experiments show that MIME produces more diverse and plausible 3D scenes than a recent generative scene method that does not know about human movement. Code and data will be available for research at https://mime.is.tue.mpg.de.
The Age-of-Information (AoI) metric has been widely studied in the theoretical communication networks and queuing systems literature. However, experimental evaluation of its applicability to complex real-world time-sensitive systems is largely lacking. In this work, we develop, implement, and evaluate an AoI-based application layer middleware that enables the customization of WiFi networks to the needs of time-sensitive applications. By controlling the storage and flow of information in the underlying WiFi network, our middleware can: (i) prevent packet collisions; (ii) discard stale packets that are no longer useful; and (iii) dynamically prioritize the transmission of the most relevant information. To demonstrate the benefits of our middleware, we implement a mobility tracking application using a swarm of UAVs communicating with a central controller via WiFi. Our experimental results show that, when compared to WiFi-UDP/WiFi-TCP, the middleware can improve information freshness by a factor of 109x/48x and tracking accuracy by a factor of 4x/6x, respectively. Most importantly, our results also show that the performance gains of our approach increase as the system scales and/or the traffic load increases.
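The three middleware behaviours listed above can be illustrated with a small sketch. It is not the authors' implementation: the function names, the `(source, generation_time)` packet representation, and the max-age-first scheduling rule are all assumptions made for illustration. The only standard ingredient is the usual AoI definition, the current time minus the generation time of the freshest packet already delivered.

```python
def age_of_information(now, deliveries):
    """AoI at time `now`, given (generation_time, delivery_time) pairs.

    Returns None if nothing has been delivered yet.
    """
    delivered = [gen for gen, arrival in deliveries if arrival <= now]
    if not delivered:
        return None
    return now - max(delivered)


def drop_stale(queue):
    """Keep only the freshest queued packet per source (hypothetical policy).

    `queue` is a list of (source, generation_time) tuples; older packets
    from the same source are treated as stale and discarded.
    """
    freshest = {}
    for source, gen_time in queue:
        if source not in freshest or gen_time > freshest[source]:
            freshest[source] = gen_time
    return freshest


def pick_source(last_delivered_gen, now):
    """Poll the source whose information at the controller is oldest.

    Polling one source at a time is one simple way to avoid collisions;
    the max-age-first rule is an assumption, not taken from the paper.
    """
    return max(last_delivered_gen, key=lambda s: now - last_delivered_gen[s])
```

For example, `pick_source({"uav1": 8.0, "uav2": 3.0}, now=10.0)` selects `"uav2"`, whose last delivered update is the older of the two.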
Consider a scenario in one-shot query-guided object localization where neither an image of the object nor the object category name is available as a query. In such a scenario, a hand-drawn sketch of the object could be a choice for a query. However, hand-drawn crude sketches alone, when used as queries, might be ambiguous for object localization, e.g., a sketch of a laptop could be confused for a sofa. On the other hand, a linguistic definition of the category, e.g., "a small portable computer small enough to use in your lap", along with the sketch query, gives better visual and semantic cues for object localization. In this work, we present a multimodal query-guided object localization approach under the challenging open-set setting. In particular, we use queries from two modalities, namely, hand-drawn sketch and description of the object (also known as gloss), to perform object localization. Multimodal query-guided object localization is a challenging task, especially when a large domain gap exists between the queries and the natural images, as well as due to the challenge of combining the complementary and minimal information present across the queries. For example, hand-drawn crude sketches contain abstract shape information of an object, while the text descriptions often capture partial semantic information about a given object category. To address the aforementioned challenges, we present a novel cross-modal attention scheme that guides the region proposal network to generate object proposals relevant to the input queries and a novel orthogonal projection-based proposal scoring technique that scores each proposal with respect to the queries, thereby yielding the final localization results. ...
Although prediction models for delirium, a commonly occurring condition during general hospitalization or post-surgery, have not gained huge popularity, their algorithmic bias evaluation is crucial due to the existing association between social determinants of health and delirium risk. In this context, using MIMIC-III and another academic hospital dataset, we present some initial experimental evidence showing how sociodemographic features such as sex and race can impact the model performance across subgroups. With this work, our intent is to initiate a discussion about the intersectionality effects of old age, race and socioeconomic factors on the early-stage detection and prevention of delirium using ML.
This paper presents a framework for jointly grounding objects that follow certain semantic relationship constraints given in a scene graph. A typical natural scene contains several objects, often exhibiting visual relationships of varied complexities between them. These inter-object relationships provide strong contextual cues toward improving grounding performance compared to a traditional object query-only-based localization task. A scene graph is an efficient and structured way to represent all the objects and their semantic relationships in the image. In an attempt towards bridging these two modalities representing scenes and utilizing contextual information for improving object localization, we rigorously study the problem of grounding scene graphs on natural images. To this end, we propose a novel graph neural network-based approach referred to as Visio-Lingual Message PAssing Graph Neural Network (VL-MPAG Net). In VL-MPAG Net, we first construct a directed graph with object proposals as nodes and an edge between a pair of nodes representing a plausible relation between them. Then a three-step inter-graph and intra-graph message passing is performed to learn the context-dependent representation of the proposals and query objects. These object representations are used to score the proposals to generate object localization. The proposed method significantly outperforms the baselines on four public datasets.
We consider stochastic gradient descents on the space of large symmetric matrices of suitable functions that are invariant under permuting the rows and columns using the same permutation. We establish deterministic limits of these random curves as the dimensions of the matrices go to infinity while the entries remain bounded. Under a "small noise" assumption the limit is shown to be the gradient flow of functions on graphons whose existence was established in arXiv:2111.09459. We also consider limits of stochastic gradient descents with added properly scaled reflected Brownian noise. The limiting curve of graphons is characterized by a family of stochastic differential equations with reflections and can be thought of as an extension of the classical McKean-Vlasov limit for interacting diffusions. The proofs introduce a family of infinite-dimensional exchangeable arrays of reflected diffusions and a novel notion of propagation of chaos for large matrices of interacting diffusions.
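Active speaker detection (ASD) in videos with multiple speakers is a challenging task because it requires learning effective audiovisual features and spatial-temporal correlations over long temporal windows. In this paper, we present SPELL, a novel spatial-temporal graph learning framework that can solve complex tasks such as ASD. To this end, each person in a video frame is first encoded in a unique node for that frame. Nodes corresponding to a single person across frames are connected to encode their temporal dynamics. Nodes within a frame are also connected to encode inter-person relationships. Thus, SPELL reduces ASD to a node classification task. Importantly, SPELL is able to reason over long temporal contexts for all nodes without relying on computationally expensive fully connected graph neural networks. Through extensive experiments on the AVA-ActiveSpeaker dataset, we demonstrate that learning graph-based representations can significantly improve active speaker detection performance owing to their explicit spatial and temporal structure. SPELL outperforms all previous state-of-the-art approaches while requiring significantly lower memory and computational resources. Our code is publicly available at https://github.com/sra2/spell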
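The graph construction described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the released code: the `(frame_index, person_id)` input format, the function name `build_graph`, and the `max_frame_gap` parameter that limits how far apart two frames may be for a temporal edge are all hypothetical.

```python
def build_graph(detections, max_frame_gap=1):
    """Build a sparse spatial-temporal graph from per-frame person detections.

    `detections` is a list of (frame_index, person_id) tuples; each tuple
    becomes one node. Returns the node list and a set of undirected edges
    given as (i, j) index pairs with i < j.
    """
    nodes = list(detections)
    edges = set()
    for a, (frame_a, person_a) in enumerate(nodes):
        for b in range(a + 1, len(nodes)):
            frame_b, person_b = nodes[b]
            if frame_a == frame_b:
                # Spatial edge: different people visible in the same frame.
                edges.add((a, b))
            elif person_a == person_b and abs(frame_a - frame_b) <= max_frame_gap:
                # Temporal edge: the same person in nearby frames
                # (the gap limit is an assumption of this sketch).
                edges.add((a, b))
    return nodes, edges
```

With five detections such as `[(0, "A"), (0, "B"), (1, "A"), (1, "B"), (2, "A")]`, this yields 5 edges, whereas a fully connected graph over the same nodes would have 10. A node classifier (speaking or not speaking) would then be trained on top of this graph.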
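In weakly supervised learning (WSL), a model is trained on noisy labels obtained from semantic rules and task-specific pre-trained models. Rules generalize poorly across tasks and require significant manual effort, while pre-trained models are available only for a limited set of tasks. In this work, we propose to use prompt-based methods as weak sources to obtain noisy labels for unannotated data. We show that task-agnostic prompts are generalizable and can be used to obtain noisy labels for different spoken language understanding (SLU) tasks such as sentiment classification, disfluency detection, and emotion classification. These prompts can also be updated to add task-specific context, which provides flexibility in designing task-specific prompts. We demonstrate that prompt-based methods generate reliable labels for the above SLU tasks and can therefore be used as a universal weak source to train a weakly supervised model (WSM) in the absence of labeled data. Our proposed WSL pipeline, trained on prompt-based weak sources, outperforms other competitive low-resource benchmarks on zero-shot and few-shot learning in Macro-F1 on all three benchmark SLU datasets. The proposed method also outperforms a conventional rule-based WSL pipeline by more than 5% in Macro-F1.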
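Although Shapley values provide effective explanations for DNN model predictions, their computation relies on enumerating all possible coalitions of input features, which leads to exponentially growing complexity. To address this problem, we propose a novel method, SHEAR, that significantly accelerates the Shapley explanation of DNN models by involving only a few coalitions of input features in the computation. The selection of feature coalitions follows our proposed Shapley chain rule, which minimizes the absolute error from the ground-truth Shapley values, so that the computation is both efficient and accurate. To demonstrate its effectiveness, we comprehensively evaluate SHEAR across multiple metrics, including the absolute error from the ground-truth Shapley values, the faithfulness of the explanations, and running speed. The experimental results show that SHEAR consistently outperforms state-of-the-art baseline methods across the different evaluation metrics, which demonstrates its potential in real-world applications where computational resources are limited.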
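To make the exponential cost concrete, the sketch below computes exact Shapley values by enumerating every coalition, which is the baseline the abstract says becomes intractable. It is the textbook definition, not the proposed accelerated method; the names `exact_shapley` and `toy_value`, and the toy value function itself, are made up for illustration.

```python
from itertools import combinations
from math import factorial


def exact_shapley(value_fn, n_features):
    """Exact Shapley values via full coalition enumeration.

    `value_fn` maps a frozenset of feature indices to a real number.
    Each feature needs 2^(n-1) coalitions, so the cost grows exponentially.
    """
    phi = [0.0] * n_features
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for size in range(len(others) + 1):
            # Weight of a coalition of this size in the Shapley formula.
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for coalition in combinations(others, size):
                s = frozenset(coalition)
                # Marginal contribution of feature i to coalition s.
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi


def toy_value(coalition):
    """Hypothetical value function: additive terms plus one interaction."""
    base = [1.0, 2.0, 3.0]
    total = sum(base[j] for j in coalition)
    if 0 in coalition and 1 in coalition:
        total += 4.0  # interaction bonus shared by features 0 and 1
    return total
```

Calling `exact_shapley(toy_value, 3)` splits the interaction bonus evenly between features 0 and 1, and the resulting values sum to the value of the full coalition (the efficiency property). With dozens of input features the inner enumeration already becomes impractical, which is the motivation for restricting the computation to a few coalitions.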